# Variational autoencoder

| Model | Author | License | Downloads | Likes | Tags | Description |
|---|---|---|---|---|---|---|
| Pulaski ProbUNet3D Base VSeg | soumickmj | Apache-2.0 | 14 | 0 | Image Segmentation | PULASki is a computationally efficient biomedical image segmentation tool that captures the variability in expert annotations, particularly suited to small datasets and class-imbalanced tasks. |
| Nepali Male V1 | tuskbyte | Apache-2.0 | 78 | 0 | Speech Synthesis, Transformers, Other | Nepali male voice synthesis model based on the VITS architecture, supporting high-quality text-to-speech. |
| Vits Cmn | BricksDisplay | Apache-2.0 | 21 | 4 | Speech Synthesis, Transformers, Chinese | VITS is an end-to-end text-to-speech model based on adversarial learning and a conditional variational autoencoder; this checkpoint supports Chinese speech synthesis. |
| Mms Tts Mah | facebook | | 124 | 0 | Speech Synthesis, Transformers | Marshallese text-to-speech model developed by Meta, using the end-to-end VITS architecture for high-quality speech synthesis. |
| Mms Tts Llg | facebook | | 4 | 0 | Speech Synthesis, Transformers | Loluo (llg) text-to-speech model developed by Meta, part of the Massively Multilingual Speech (MMS) project. |
| Mms Tts Ljp | facebook | | 4 | 0 | Speech Synthesis, Transformers | Lampung Api text-to-speech model developed by Meta, part of the MMS project. |
| Mms Tts Bgr | facebook | | 14 | 0 | Speech Synthesis, Transformers | Bawm Chin text-to-speech model developed by Meta, part of the MMS project. |
| Mms Tts Khm | facebook | | 217 | 7 | Speech Synthesis, Transformers | Khmer text-to-speech model from Facebook's MMS project, implementing end-to-end speech synthesis with the VITS architecture. |
| Mms Tts Pan | facebook | | 800 | 2 | Speech Synthesis, Transformers | Eastern Punjabi text-to-speech model developed by Facebook, based on the VITS architecture, supporting high-quality speech synthesis. |
| Mms Tts Pag | facebook | | 18 | 0 | Speech Synthesis, Transformers | Pangasinan text-to-speech model developed by Meta, based on the VITS architecture, supporting high-quality speech synthesis. |
| Mms Tts Gbm | facebook | | 18 | 0 | Speech Synthesis, Transformers | Garhwali text-to-speech model developed by Meta, supporting high-quality speech synthesis. |
| Mms Tts Ilo | facebook | | 40 | 0 | Speech Synthesis, Transformers | Ilocano text-to-speech model developed by Meta, based on the VITS architecture, supporting high-quality speech synthesis. |
| Mms Tts Swh | facebook | | 161 | 9 | Speech Synthesis, Transformers | Swahili text-to-speech model developed by Meta, based on the VITS architecture, supporting high-quality speech synthesis. |
| Mms Tts Mal | facebook | | 307 | 2 | Speech Synthesis, Transformers | Malayalam text-to-speech model in Facebook's MMS project, implementing end-to-end speech synthesis based on the VITS architecture. |
| Mms Tts Hat | facebook | | 223 | 1 | Speech Synthesis, Transformers | Haitian Creole text-to-speech model developed by Meta, part of the Massively Multilingual Speech (MMS) project. |
| Vits Vctk | kakao-enterprise | MIT | 3,601 | 13 | Speech Synthesis, Transformers | VITS is an end-to-end speech synthesis model that predicts a speech waveform from an input text sequence. It employs a conditional variational autoencoder (VAE) architecture, comprising a posterior encoder, a decoder, and a conditional prior module. |
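Several of the models above are built on a conditional variational autoencoder, whose training relies on the reparameterization trick: sampling the latent as a deterministic function of the encoder outputs plus independent noise, so gradients can flow through the sampling step. A minimal NumPy sketch of that mechanic and the standard Gaussian KL term follows (an illustration only, not the VITS implementation; the function names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

def reparameterize(mu, log_var):
    """Sample z ~ N(mu, sigma^2) as z = mu + sigma * eps with eps ~ N(0, I).

    Writing the sample this way keeps z differentiable with respect to
    mu and log_var, which is what allows end-to-end VAE training.
    """
    sigma = np.exp(0.5 * log_var)
    eps = rng.standard_normal(np.shape(mu))
    return mu + sigma * eps

def kl_to_standard_normal(mu, log_var):
    """Closed-form KL( N(mu, sigma^2) || N(0, I) ), summed over dimensions."""
    return 0.5 * np.sum(np.exp(log_var) + mu**2 - 1.0 - log_var)

# Example: a 16-dimensional latent for a single input.
mu = np.zeros(16)
log_var = np.zeros(16)  # sigma = 1 everywhere
z = reparameterize(mu, log_var)
print(z.shape)                              # (16,)
print(kl_to_standard_normal(mu, log_var))   # 0.0 when q(z|x) = N(0, I)
```

In a full VAE, `mu` and `log_var` would come from the posterior encoder, and the KL term would be weighed against a reconstruction loss; the snippet shows only the sampling and regularization pieces.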